.. code:: ipython3

    import json
    import os
    import pandas as pd
    import shutil

    from seeq import spy

    # Set the compatibility option so that you maximize the chance that SPy will remain compatible with your notebook/script
    spy.options.compatibility = 192

.. code:: ipython3

    # Log into Seeq Server if you're not using Seeq Data Lab:
    spy.login(url='http://localhost:34216', credentials_file='../credentials.key', force=False)

Workbook Jobs
=============

In ``spy.workbooks.ipynb``, you can learn how to push and pull workbooks
(Workbench Analyses and Organizer Topics) to/from the Seeq service/server
using SPy. You may need to do this work “in bulk” in one of the following
scenarios:

- Re-mapping references (e.g. historian tags/signals) from one datasource to
  another, or one asset tree to another
- Transferring work from one Seeq service/server to another, possibly
  including data

The set of functions in the ``spy.workbooks.job`` module is suited to this
work. Each function operates within a “job folder” that captures the state
of the job. Unlike ``spy.workbooks.pull()`` and ``spy.workbooks.push()``,
the equivalent commands in ``spy.workbooks.job`` do not require all
workbooks to be held in memory, which allows very large jobs to be executed
(as long as there is sufficient disk space). All parts of the process are
*resumable*: SPy will pick up where it left off if the operation is
interrupted for any reason (e.g. a network error).

This notebook walks through the use of this module, referencing the
scenarios above. In general, commands are executed in the following order
(sketched end-to-end just below):

1. ``spy.workbooks.job.pull()``
2. ``spy.workbooks.job.data.pull()`` (optional)
3. ``spy.workbooks.job.push()``
4. ``spy.workbooks.job.data.push()`` (optional)
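To orient yourself, here is the whole cycle in miniature, using only
functions that are covered in the sections that follow. (This is a sketch
rather than a cell to run now: the job folder and the DataFrame of workbooks
are established below, so the calls are commented out.)

.. code:: ipython3

    # A sketch of a complete job cycle; each step is explained later in this
    # notebook, so the calls are commented out here.
    #
    # workbooks_df = spy.workbooks.search({'Path': 'Some >> Folder >> Path'})
    # spy.workbooks.job.pull(job_folder, workbooks_df)  # 1. Pull workbook definitions to disk
    # spy.workbooks.job.data.pull(job_folder)           # 2. (Optional) Pull the referenced data
    # spy.workbooks.job.push(job_folder)                # 3. Push the workbooks to the target server
    # spy.workbooks.job.data.push(job_folder)           # 4. (Optional) Push data to any dummy items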
Establish the Job Folder
------------------------

The parameter that defines a job is a *job folder*. It is the first argument
for all job functions, and it is managed entirely by SPy. The folder is laid
out in an intuitive way that allows you to inspect it and, in some
troubleshooting cases, make modifications yourself.

.. code:: ipython3

    job_folder = 'Output/My First Workbooks Job'

    # Remove the job folder so that old files/artifacts don't affect the tutorial
    if os.path.exists(job_folder):
        shutil.rmtree(job_folder)

Let’s Make Something to Work With…
----------------------------------

We need some Analyses/Topics to work with for the purposes of demonstrating
the functionality, so let’s make sure the example workbooks have been
pushed.

.. code:: ipython3

    example_workbooks = spy.workbooks.load('Support Files/Example Export.zip')
    spy.workbooks.push(example_workbooks,
                       path='SPy Documentation Examples >> Workbook Job Import',
                       label=f'{spy.session.user.name} Workbook Job Example',
                       refresh=False,
                       errors='raise')

Pulling Workbooks
-----------------

Start the job cycle by issuing ``spy.workbooks.job.pull()`` to grab a set of
workbooks and write them to disk. As with ``spy.workbooks.pull()``, we
create a DataFrame full of workbooks to pull by using the
``spy.workbooks.search()`` function. Then we can supply that DataFrame to
``spy.workbooks.job.pull()``, which takes many of the same parameters as
``spy.workbooks.pull()``.

.. code:: ipython3

    workbooks_df = spy.workbooks.search({
        'Path': 'SPy Documentation Examples >> Workbook Job Import'
    })

    # Store these in variables that we'll use later
    example_analysis_workbook_id = workbooks_df[workbooks_df['Name'] == 'Example Analysis'].iloc[0]['ID']
    example_topic_workbook_id = workbooks_df[workbooks_df['Name'] == 'Example Topic'].iloc[0]['ID']

    workbooks_df

.. code:: ipython3

    spy.workbooks.job.pull(job_folder, workbooks_df)

As mentioned earlier, jobs are *resumable*. If you execute the above cell
again, you will see that the **Result** column indicates ``Already pulled``.
If you would like to force a job to redo its work, supply the
``resume=False`` argument. You can also inspect the job folder’s
``Workbooks`` subfolder and selectively delete workbook folders therein to
force SPy to re-pull those workbooks.

Pushing Workbooks
-----------------

As mentioned above, there are two primary scenarios where you want to push
workbooks in bulk:

- Re-mapping references (e.g. historian tags/signals) from one datasource to
  another, or one asset tree to another
- Transferring work from one Seeq service/server to another, possibly
  including data

Datasource Maps
~~~~~~~~~~~~~~~

In either case, it’s important to understand the concept of *datasource
maps*. These are JSON files that contain instructions for SPy as it maps the
identifiers in the pulled workbook definitions to identifiers on the target
system. These maps can incorporate relatively complex Regular Expression
specifications that allow you to re-orient workbooks from one set of input
data to another.

The ``spy.workbooks.job.pull()`` command will create a ``Datasource Maps``
folder inside the job folder. There will be one file for every datasource
that was encountered during the pull operation; if a workbook touched a
datasource in some way, there will be a file for it.

Here’s what a typical file looks like:

::

    {
        "Datasource Class": "Time Series CSV Files",
        "Datasource ID": "Example Data",
        "Datasource Name": "Example Data",
        "Item-Level Map Files": [],
        "RegEx-Based Maps": [
            {
                "Old": {
                    "Type": "(?<type>.*)",
                    "Datasource Class": "Time Series CSV Files",
                    "Datasource Name": "Example Data",
                    "Data ID": "(?<data_id>.*)"
                },
                "New": {
                    "Type": "${type}",
                    "Datasource Class": "Time Series CSV Files",
                    "Datasource Name": "Example Data",
                    "Data ID": "${data_id}"
                }
            }
        ]
    }

You can make modifications to these files by loading them into an editor,
including Jupyter’s text editor. Generally the most common action is to add
or change entries in the ``RegEx-Based Maps`` block. That section is a
*list* of *dictionaries* that each have an ``Old`` and a ``New`` subsection.

Within the ``Old`` block, you can specify properties to match on. The key is
the property name and the value is a regular expression, often employing a
*capture group*. In the example above, the ``Data ID`` field is matched
using the ``.*`` regex and stored in a capture group called ``data_id``.

The ``New`` block then contains the properties and values that will be used
to search for the corresponding items on the target system. In the example
above, the ``"Data ID": "${data_id}"`` specification means that the Data ID
is used “as-is”, without any alteration.

(If you are familiar with Connector Property Transforms, this regex approach
will feel familiar.)
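To build intuition for how an ``Old``/``New`` pair behaves, here is a rough
Python equivalent of the map above. This is an illustration only: SPy’s map
files spell named capture groups as ``(?<name>...)`` and substitutions as
``${name}``, whereas Python’s ``re`` module spells them ``(?P<name>...)``
and ``\g<name>``.

.. code:: ipython3

    import re

    # Rough Python analog of the map above: "Old" matches a property value
    # and captures part of it; "New" builds the search value by substitution.
    old_data_id = re.compile(r'(?P<data_id>.*)')  # SPy spells this (?<data_id>.*)

    # A hypothetical Data ID value, purely for illustration:
    candidate = 'Area A_Temperature'

    match = old_data_id.fullmatch(candidate)
    if match:
        # "${data_id}" in the "New" block corresponds to this substitution:
        new_data_id = match.expand(r'\g<data_id>')
        print(new_data_id)  # Unchanged: the capture is used as-is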
Let’s look at a more complicated example:

::

    {
        "Datasource Class": "Time Series CSV Files",
        "Datasource ID": "Example Data",
        "Datasource Name": "Example Data",
        "Item-Level Map Files": [],
        "RegEx-Based Maps": [
            {
                "Old": {
                    "Type": "(?<type>.*)",
                    "Datasource Class": "Time Series CSV Files",
                    "Datasource Name": "Example Data",
                    "Data ID": "(?<data_id>.*)"
                },
                "New": {
                    "Type": "${type}",
                    "Datasource Class": "Time Series CSV Files",
                    "Datasource Name": "Example Data",
                    "Data ID": "${data_id}"
                }
            },
            {
                "Old": {
                    "Type": "(?<type>.*)",
                    "Path": "Example >> Cooling Tower 1",
                    "Asset": "Area (?<subarea>[ABC])",
                    "Name": "(?<name>.*)"
                },
                "New": {
                    "Type": "${type}",
                    "Path": "Example >> Cooling Tower 2",
                    "Asset": "Area ${subarea}",
                    "Name": "${name}"
                }
            }
        ]
    }

In this example there are two RegEx-Based Maps specified. The first map is
identical to the previous example, and it is tried first; if there is no
match on its ``Old`` regex specifications, SPy moves on to the next. The
second map matches on a particular asset path
(``Example >> Cooling Tower 1``) and a set of subareas (``A``, ``B``, or
``C``) and maps each one to the same-named area underneath
``Example >> Cooling Tower 2``.

In this manner, you can use arbitrarily complex mapping logic to accomplish
the goal of re-mapping a workbook within the same Seeq server or properly
mapping from one Seeq server to another.
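If you’d rather script such a change than edit the JSON by hand, here is a
minimal sketch that appends the Cooling Tower map from the example above to
the file SPy generated for this tutorial’s datasource. (The file name
matches the one used later in this notebook; adjust it for your own job
folder. The next section overwrites this same file, so experimenting here
won’t derail the rest of the tutorial.)

.. code:: ipython3

    # A sketch: append a RegEx-Based Map to an existing datasource map file.
    map_file = os.path.join(job_folder, 'Datasource Maps',
                            'Datasource_Map_Time Series CSV Files_Example Data_Example Data.json')

    with open(map_file, 'r') as f:
        existing_map = json.load(f)

    existing_map['RegEx-Based Maps'].append({
        "Old": {
            "Type": "(?<type>.*)",
            "Path": "Example >> Cooling Tower 1",
            "Asset": "Area (?<subarea>[ABC])",
            "Name": "(?<name>.*)"
        },
        "New": {
            "Type": "${type}",
            "Path": "Example >> Cooling Tower 2",
            "Asset": "Area ${subarea}",
            "Name": "${name}"
        }
    })

    with open(map_file, 'w') as f:
        json.dump(existing_map, f, indent=4)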
Datasource Mapping in Action
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Let’s run through an actual mapping scenario to see how it works and how to
troubleshoot it when it goes wrong.

First we have to grab a couple of signal IDs so that we can use them later
to illustrate some functionality.

.. code:: ipython3

    area_a_temperature_id = spy.search({'Datasource Name': 'Example Data', 'Name': 'Area A_Temperature'}).iloc[0]['ID']
    area_a_optimizer_id = spy.search({'Datasource Name': 'Example Data', 'Name': 'Area A_Optimizer'}).iloc[0]['ID']

Now we will write out a datasource map file whose first map has a ``New``
block that maps to a ``Name`` that does not exist. This will let us see what
happens both when the mapping is successful and when there are errors.

.. code:: ipython3

    datasource_map = {
        "Datasource Class": "Time Series CSV Files",
        "Datasource ID": "Example Data",
        "Datasource Name": "Example Data",
        "Item-Level Map Files": [],
        "RegEx-Based Maps": [
            {
                "Old": {
                    "Type": "(?<type>.*)",
                    "Datasource Class": "Time Series CSV Files",
                    "Datasource Name": "Example Data",
                    "Name": "Area A_Optimizer"
                },
                "New": {
                    "Type": "${type}",
                    "Datasource Class": "Time Series CSV Files",
                    "Datasource Name": "Example Data",
                    "Name": "Area NonExistent_Optimizer"
                },

                # In this contrived example, if we match on the "Old" criteria,
                # we don't want to continue to the next regex map
                "On Match": "Stop"
            },
            {
                "Old": {
                    "Type": "(?<type>.*)",
                    "Datasource Class": "Time Series CSV Files",
                    "Datasource Name": "Example Data",
                    "Data ID": "(?<data_id>.*)"
                },
                "New": {
                    "Type": "${type}",
                    "Datasource Class": "Time Series CSV Files",
                    "Datasource Name": "Example Data",
                    "Data ID": "${data_id}"
                }
            }
        ]
    }

    with open(os.path.join(job_folder, 'Datasource Maps',
                           'Datasource_Map_Time Series CSV Files_Example Data_Example Data.json'), 'w') as f:
        json.dump(datasource_map, f)

Now we push to the server using a label that is guaranteed to differentiate
our activity from that of other users. As you will see, there will be errors
reported in the **Result** column because one item won’t be mapped.

.. code:: ipython3

    push_df = spy.workbooks.job.push(job_folder,
                                     path='SPy Documentation Examples >> Workbook Jobs',
                                     label=f'{spy.session.user.name} Workbook Job Example',
                                     errors='catalog')

You can see the error, but it’s not formatted very well. So let’s use a
troubleshooting tool: the “explain” function on the returned DataFrame.

.. code:: ipython3

    print(push_df.spy.item_map.explain(area_a_optimizer_id))

This detailed explanation is intended to give you a starting point for
troubleshooting. You can see the **regex** that was specified, the property
values that were **matched on**, the **capture groups** that resulted from
the RegEx specifications, and the property values that were subsequently
**searched for**. Since ``Area NonExistent_Optimizer`` does not exist, the
explanation for ``RegEx-Based Map 0`` says *Item not found on server*.

Now let’s look at the explanation for a successful map
(``Area A_Temperature``):

.. code:: ipython3

    print(push_df.spy.item_map.explain(area_a_temperature_id))

Dummy Items
~~~~~~~~~~~

In cases where we couldn’t map successfully, we can tell SPy to create
“dummy” items. A dummy item is a signal, condition or scalar that has all
the properties of the original item but no data. (We’ll show how to push
data to dummy items later…)

Note the use of ``create_dummy_items=True``, and also note ``resume=False``
so that SPy tries to push the workbooks again:

.. code:: ipython3

    push_df = spy.workbooks.job.push(
        job_folder,
        path='SPy Documentation Examples >> Workbook Jobs',
        label=f'{spy.session.user.name} Workbook Job Example',
        create_dummy_items=True,
        resume=False,
        errors='catalog')

Now there are no errors, because every item that couldn’t be mapped was
replaced by a dummy item. Find the ``Example Analysis`` row of the output
table above and click on the *link* in the ``URL`` column to take a look at
the resulting workbook. You’ll see that the **Details Pane** worksheet
contains **Area A_Optimizer**, which is a blank “dummy item”.

We can look at what dummy items were created by inspecting
``push_df.spy.item_map.dummy_items``. You can see that the ``Name`` is the
same as the original and that important properties like
``Maximum Interpolation`` have made their way to the dummy item.

.. code:: ipython3

    push_df.spy.item_map.dummy_items
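If you want to focus on just the properties called out above, you can select
columns directly. (A sketch: the column names are assumed to match the
property names shown in the output.)

.. code:: ipython3

    # A sketch: narrow the dummy items DataFrame to a couple of columns of
    # interest. Column names are assumed to match the properties shown above.
    push_df.spy.item_map.dummy_items[['Name', 'Maximum Interpolation']]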
Including Data
--------------

Dummy items are helpful, but they are “blank”: they do not have any data
associated with them. If you are transferring workbooks between servers and
the destination server doesn’t have access to the same datasources, it is
useful to be able to transfer the data itself from the source server to the
dummy items on the destination server. A set of SPy functions is provided in
``spy.workbooks.job.data`` for this purpose.

As ``spy.workbooks.job.pull()`` pulls workbook information, it also tracks
the usage of data items on Workbench Worksheets and in Organizer Topic
Documents. This information is collated and saved to disk as the *data
manifest*. You can inspect the manifest like so:

.. code:: ipython3

    manifest_df = spy.workbooks.job.data.manifest(job_folder)

    # Simplify the DataFrame so that it fits on the screen better
    manifest_df[['ID', 'Path', 'Asset', 'Name', 'Start', 'End', 'Calculation']]

The ``Start`` and ``End`` columns provide the overall time bounds that were
detected in the workbook data references. There is also a ``Calculation``
column that refines this broad time period into specific “chunks” of data,
defined as individual capsules and using the ``within()`` and ``touches()``
Seeq Formula functions so that data is pulled only for those time periods,
not for everything between ``Start`` and ``End``.

This manifest DataFrame can be fed directly into ``spy.pull()``, but it is
recommended that you use ``spy.workbooks.job.data.pull()`` like so:

.. code:: ipython3

    spy.workbooks.job.data.pull(job_folder)

Data has now been added to the job folder for the time periods identified in
the manifest, and it can be pushed to the dummy items like so:

.. code:: ipython3

    spy.workbooks.job.data.push(job_folder)

If you refresh the page of the **Details Pane** within the **Example
Analysis**, you’ll now see data for **Area A_Optimizer**. However, if you
move the Display Range to the left, you’ll see that **Area A_Optimizer**
data only exists for the time period that was originally on the screen.

If you want to expand how much data is pulled for a particular item, you can
execute a command to alter the manifest and then pull/push again:

.. code:: ipython3

    # Expand the time periods for Area A_Optimizer by 2 weeks on either side
    spy.workbooks.job.data.expand(job_folder, {'Name': 'Area A_Optimizer'}, by='2w')

    spy.workbooks.job.data.pull(job_folder, resume=False)
    spy.workbooks.job.data.push(job_folder, resume=False)

There is a series of functions for altering the manifest:

- ``spy.workbooks.job.data.expand()`` - Expand the existing time periods.
- ``spy.workbooks.job.data.add()`` - Add a specific time period.
- ``spy.workbooks.job.data.remove()`` - Remove a specific time period.
- ``spy.workbooks.job.data.calculation()`` - Apply a specific calculation,
  such as ``resample()``.

Documentation for these functions is found under the **Detailed Help**
section below.
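To illustrate the call pattern only: the query-dictionary-first shape
mirrors the ``expand()`` call above, but the remaining arguments are
hypothetical placeholders, so consult the ``help()`` output below for the
actual signatures.

.. code:: ipython3

    # Hypothetical sketches only: the query dictionary mirrors the expand()
    # call above, but the trailing arguments are placeholders. Consult
    # help(spy.workbooks.job.data.add), etc., for the real signatures.
    #
    # spy.workbooks.job.data.add(job_folder, {'Name': 'Area A_Optimizer'}, ...)
    # spy.workbooks.job.data.remove(job_folder, {'Name': 'Area A_Optimizer'}, ...)
    # spy.workbooks.job.data.calculation(job_folder, {'Name': 'Area A_Optimizer'}, ...)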
Redoing Specific Workbooks/Items
--------------------------------

In the process of pushing and pulling, you will usually use the
``errors='catalog'`` flag, which means that errors will be enumerated but
the operation will keep going if at all possible. When you resume an
operation, items that had errors will not be re-attempted, because SPy (by
default) assumes that you have chosen to ignore them.

But you will often care about those errors, and you will figure out how to
fix them (say, by altering a Datasource Map). You can force SPy to redo a
push or pull operation for a particular item or set of items using the
``redo`` family of functions:

.. code:: ipython3

    spy.workbooks.job.redo(job_folder, example_analysis_workbook_id)

.. code:: ipython3

    spy.workbooks.job.data.redo(job_folder, area_a_optimizer_id)

Zip/Unzip the Job Folder
------------------------

If you intend to transfer workbook information to another Seeq server, it is
convenient to package up the job folder as a zip file. There are two
functions for this purpose:

.. code:: ipython3

    spy.workbooks.job.zip(job_folder, overwrite=True)

.. code:: ipython3

    spy.workbooks.job.unzip(job_folder + '.zip', overwrite=True)

Detailed Help
-------------

All SPy functions have detailed documentation to help you use them. Just
execute ``help(spy.<function>)`` like you see below.

**Make sure you re-execute the cells below to see the latest documentation.
It otherwise might be from an earlier version of SPy.**

.. code:: ipython3

    help(spy.workbooks.job.pull)

.. code:: ipython3

    help(spy.workbooks.job.push)

.. code:: ipython3

    help(spy.workbooks.job.data.pull)

.. code:: ipython3

    help(spy.workbooks.job.data.manifest)
    help(spy.workbooks.job.data.expand)
    help(spy.workbooks.job.data.add)
    help(spy.workbooks.job.data.remove)
    help(spy.workbooks.job.data.calculation)

.. code:: ipython3

    help(spy.workbooks.job.data.push)

.. code:: ipython3

    help(spy.workbooks.job.redo)
    help(spy.workbooks.job.data.redo)